Welcome to my first article. Here I will give a step-by-step explanation of my implementation of a QR code generator.
General
Before actually starting the implementation, let's think about the general steps a QR code generator needs to perform. Usually, the data is a URL or, more generally, a sequence of characters. In order to transform it into a black-and-white pattern, we need to encode it somehow. This is the first step. After this, we need to think about error correction. This is actually the most important step, as real-world QR codes rarely come in a pure, undamaged form: usually some white or black squares are flipped or simply not visible. Thus, our initially encoded data bits need to be extended with some redundancy that helps to recover from errors. Lastly, we need to lay out all the generated bits in a square matrix in order to produce the actual QR code. Therefore, a rough sketch of the different steps can be made:
- Encode
- Error Correction
- Layout
Of course, there is a lot more to it than this simple explanation. But I'll leave the details to the respective sections below in order to make the discussion of them more vivid and not so dry. So let's start. We define two classes to keep everything clean, QRCodeGenerator and QRCodeUtility:
```javascript
/*
 * Class for generating a QR code.
 */
class QRCodeGenerator {}

/*
 * Utility class used (mainly) for the encoding part.
 */
class QRCodeUtility {}
```
Encoding
Analyze input
The QR code specification divides the data to be encoded into different classes. Depending on what you want to encode, a different class is used. The reason for this is that a very limited alphabet needs fewer bits per character, so more data fits into a QR code of the same size. Specifically, there are four different classes:
- Numeric: Allows the encoding of the digits 0 - 9 but nothing else.
- Alphanumeric: Allows the encoding of the digits 0 - 9, the uppercase letters A - Z, the symbols $ % * + - . / : and additionally the space character. As you can see, no lowercase letters are included. This means that for the string "Hello World" you need to use the byte encoding mode.
- Bytes: Encodes arbitrary bytes.
- Kanji: Encodes Japanese characters from the Shift JIS character set.
In the following field you can enter a string to see which class each of its characters belongs to. This input is also used in the following sections.
| Character | Numeric | Alphanumeric |
|---|---|---|
The string you entered can be encoded in alphanumeric mode.
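To make the classification above more concrete, here is a minimal sketch of how the class of an input string could be determined. The names `ALPHANUMERIC_CHARSET`, `detectMode` and `classifyCharacters` are my own choices for this illustration and not necessarily what the actual implementation uses. Kanji detection is left out to keep it short.

```javascript
// All characters allowed in alphanumeric mode, in the order in which
// the QR code specification assigns their values (0 to 44).
const ALPHANUMERIC_CHARSET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:";

// Hypothetical helper: returns the most compact class that can hold
// every character of the input. Kanji mode is not considered here.
function detectMode(text) {
  if (/^[0-9]+$/.test(text)) return "numeric";
  if ([...text].every((ch) => ALPHANUMERIC_CHARSET.includes(ch))) return "alphanumeric";
  return "byte";
}

// Hypothetical helper: per-character breakdown with one row per character,
// similar to the Character / Numeric / Alphanumeric table.
function classifyCharacters(text) {
  return [...text].map((ch) => ({
    character: ch,
    numeric: ch >= "0" && ch <= "9",
    alphanumeric: ALPHANUMERIC_CHARSET.includes(ch),
  }));
}

console.log(detectMode("12345"));       // numeric
console.log(detectMode("HELLO WORLD")); // alphanumeric
console.log(detectMode("Hello World")); // byte (lowercase letters)
```

A single lowercase letter is enough to push the whole string into byte mode, which is why "Hello World" ends up there.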
From characters to bytes
Now that we know which class should be used for encoding, we need to actually encode the data. Different classes use different encoding schemes, and walking through all of them would make the explanation quite boring and lengthy.
| Token | Value | Encoded (Padded) |
|---|---|---|
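As an illustration of one such scheme, here is a rough sketch of the alphanumeric mode as I understand it from the QR code specification. The input is split into tokens of two characters. Each pair is encoded as 45 times the value of the first character plus the value of the second, padded to 11 bits, and a leftover single character is padded to 6 bits. The function name `encodeAlphanumeric` and the shape of the returned objects are assumptions made for this example.

```javascript
// Position in this string = value of the character in alphanumeric mode.
const ALPHANUMERIC_TABLE = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:";

// Hypothetical helper: returns one entry per token with the token itself,
// its numeric value and the zero-padded binary string.
function encodeAlphanumeric(text) {
  const groups = [];
  for (let i = 0; i < text.length; i += 2) {
    const token = text.slice(i, i + 2);
    const values = [...token].map((ch) => {
      const value = ALPHANUMERIC_TABLE.indexOf(ch);
      if (value === -1) throw new Error(`Not an alphanumeric character: ${ch}`);
      return value;
    });
    // A pair becomes 45 * first + second in 11 bits, a single character uses 6 bits.
    const isPair = values.length === 2;
    const value = isPair ? values[0] * 45 + values[1] : values[0];
    const encoded = value.toString(2).padStart(isPair ? 11 : 6, "0");
    groups.push({ token, value, encoded });
  }
  return groups;
}

for (const group of encodeAlphanumeric("HELLO WORLD")) {
  console.log(group.token, group.value, group.encoded);
}
// First line printed: HE 779 01100001011
```

For "HELLO WORLD" this gives five pairs and the single leftover character D, so 5 * 11 + 6 = 61 data bits in total.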
Padding and length information
Next, the length information is placed in front of the encoded data and the padding bits are appended to its end. This yields the following:
Now that we have our data encoded, we need to convert it to bytes in order to add error correction to it.
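The two steps above (length information plus padding, then conversion to bytes) can be sketched roughly as follows. This assumes a QR code of version 1 to 9, for which the length field is 10, 9 or 8 bits wide for numeric, alphanumeric and byte mode. The function `buildDataCodewords` and its parameters are hypothetical names for this example, and the number of data bytes the chosen version and error correction level can hold is simply passed in as `capacityBytes`.

```javascript
// 4-bit mode indicators and length field widths (versions 1 to 9 only).
const MODE_INDICATOR = { numeric: "0001", alphanumeric: "0010", byte: "0100" };
const COUNT_BITS_V1_TO_V9 = { numeric: 10, alphanumeric: 9, byte: 8 };

// Hypothetical helper: builds the final list of data bytes.
function buildDataCodewords(mode, charCount, dataBits, capacityBytes) {
  const capacityBits = capacityBytes * 8;
  // Mode indicator and character count go in front of the data bits.
  let bits =
    MODE_INDICATOR[mode] +
    charCount.toString(2).padStart(COUNT_BITS_V1_TO_V9[mode], "0") +
    dataBits;
  if (bits.length > capacityBits) throw new Error("Data does not fit");

  // Terminator: up to four zero bits, fewer if the capacity is reached earlier.
  bits += "0".repeat(Math.min(4, capacityBits - bits.length));
  // Fill up with zeros to the next byte boundary.
  bits += "0".repeat((8 - (bits.length % 8)) % 8);

  const bytes = [];
  for (let i = 0; i < bits.length; i += 8) {
    bytes.push(parseInt(bits.slice(i, i + 8), 2));
  }
  // Remaining capacity is filled with the alternating pad bytes 0xEC and 0x11.
  const padBytes = [0xec, 0x11];
  for (let i = 0; bytes.length < capacityBytes; i++) {
    bytes.push(padBytes[i % 2]);
  }
  return bytes;
}

// The 61 data bits of "HELLO WORLD" in alphanumeric mode.
const helloWorldBits =
  "01100001011" + "01111000110" + "10001011100" +
  "10110111000" + "10011010100" + "001101";
// Version 1 with error correction level Q holds 13 data bytes.
console.log(buildDataCodewords("alphanumeric", 11, helloWorldBits, 13));
// codewords: 32, 91, 11, 120, 209, 114, 220, 77, 67, 64, 236, 17, 236
```

In this example the header and data take up 74 bits, the terminator and the zero fill bring that to 80 bits (10 bytes), and three pad bytes complete the 13 bytes.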
Now we have the data bytes together with the additional error correction terms.
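QR codes use Reed-Solomon codes for this step. The following is only a compact sketch of the idea, not necessarily how my implementation is structured: the data bytes are treated as coefficients of a polynomial over the finite field GF(256), which is divided by a generator polynomial, and the remainder forms the error correction bytes. All names (`GF_EXP`, `GF_LOG`, `gfMul`, `generatorPolynomial`, `errorCorrectionCodewords`) are chosen for this example.

```javascript
// Exponent and logarithm tables for GF(256), built with the
// polynomial x^8 + x^4 + x^3 + x^2 + 1 (0x11d) used by QR codes.
const GF_EXP = new Array(512).fill(0);
const GF_LOG = new Array(256).fill(0);
for (let i = 0, value = 1; i < 255; i++) {
  GF_EXP[i] = value;
  GF_LOG[value] = i;
  value <<= 1;
  if (value & 0x100) value ^= 0x11d;
}
// Duplicate the table so that gfMul needs no modulo operation.
for (let i = 255; i < 512; i++) GF_EXP[i] = GF_EXP[i - 255];

// Multiplication in GF(256) via the logarithm tables.
function gfMul(a, b) {
  return a === 0 || b === 0 ? 0 : GF_EXP[GF_LOG[a] + GF_LOG[b]];
}

// Product of (x + alpha^i) for i = 0 .. degree - 1,
// coefficients ordered from the highest power to the lowest.
function generatorPolynomial(degree) {
  let poly = [1];
  for (let i = 0; i < degree; i++) {
    const next = new Array(poly.length + 1).fill(0);
    for (let j = 0; j < poly.length; j++) {
      next[j] ^= poly[j];                       // poly[j] * x
      next[j + 1] ^= gfMul(poly[j], GF_EXP[i]); // poly[j] * alpha^i
    }
    poly = next;
  }
  return poly;
}

// Polynomial long division: the remainder is the list of error correction bytes.
function errorCorrectionCodewords(data, ecCount) {
  const generator = generatorPolynomial(ecCount);
  const remainder = [...data, ...new Array(ecCount).fill(0)];
  for (let i = 0; i < data.length; i++) {
    const factor = remainder[i];
    if (factor === 0) continue;
    for (let j = 0; j < generator.length; j++) {
      remainder[i + j] ^= gfMul(generator[j], factor);
    }
  }
  return remainder.slice(data.length);
}

// Version 1 with level Q: 13 data bytes and 13 error correction bytes.
const dataCodewords = [32, 91, 11, 120, 209, 114, 220, 77, 67, 64, 236, 17, 236];
console.log(errorCorrectionCodewords(dataCodewords, 13));
```

Larger QR code versions split the data into several blocks and compute the error correction bytes per block. That part is not shown here.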